Program Optimization Based on Compile-Time Cache Performance Prediction

Authors

  • Wesley K. Kaplow
  • Boleslaw K. Szymanski
Abstract

We present a novel compile-time method for determining the cache performance of the loop nests in a program. The cache-miss rates are produced by applying the reference string of a loop nest, determined during compilation, to an architecturally parameterized cache simulator. The obtained cache-miss rates correlate well with the performance of the loop nests on actual target machines. We also describe a heuristic that uses this method for compile-time optimization of loop ranges in iteration-space blocking. The results of the loop program optimizations are presented for different processor architectures, namely the IBM SP1 RS/6000, the SuperSPARC, and the Intel i860.
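The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`miss_rate`, `transpose_refs`, `best_block`), the LRU replacement policy, the blocked-transpose loop nest, and the exhaustive search over candidate block sizes are all assumptions made for illustration, since the abstract does not describe the simulator internals or the heuristic. The sketch generates the reference string of a loop nest, feeds it to a cache model parameterized by size, line size, and associativity, and compares miss rates across block sizes.

```python
from collections import OrderedDict


def miss_rate(addresses, cache_bytes, line_bytes, ways):
    """Simulate a set-associative LRU cache; return misses / references."""
    num_sets = cache_bytes // (line_bytes * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = total = 0
    for addr in addresses:
        total += 1
        line = addr // line_bytes
        lru = sets[line % num_sets]
        if line in lru:
            lru.move_to_end(line)          # hit: mark as most recently used
        else:
            misses += 1
            if len(lru) >= ways:
                lru.popitem(last=False)    # evict the least recently used line
            lru[line] = True
    return misses / total if total else 0.0


def transpose_refs(n, block, elem_bytes=8):
    """Reference string of a blocked transpose B[j][i] = A[i][j] (hypothetical loop nest)."""
    base_a, base_b = 0, n * n * elem_bytes
    for ii in range(0, n, block):
        for jj in range(0, n, block):
            for i in range(ii, min(ii + block, n)):
                for j in range(jj, min(jj + block, n)):
                    yield base_a + (i * n + j) * elem_bytes  # read A[i][j]
                    yield base_b + (j * n + i) * elem_bytes  # write B[j][i]


def best_block(n, candidates, **cache):
    """Exhaustive stand-in for a blocking heuristic: pick the lowest simulated miss rate."""
    rates = {b: miss_rate(transpose_refs(n, b), **cache) for b in candidates}
    return min(rates, key=rates.get), rates


if __name__ == "__main__":
    # Assumed cache parameters: 8 KiB, 32-byte lines, direct-mapped.
    best, rates = best_block(64, [4, 8, 16, 64],
                             cache_bytes=8192, line_bytes=32, ways=1)
    for b, r in sorted(rates.items()):
        print(f"block={b:3d}  miss rate={r:.3f}")
    print("selected block size:", best)
```

Changing `cache_bytes`, `line_bytes`, and `ways` models a different target architecture without touching the reference-string generator, which is the sense in which such a simulator is architecturally parameterized.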

Similar resources

Parallel Processing Letters Program Optimization Based on Compile-time Cache Performance Prediction

We present a novel, compile-time method for determining the cache performance of the loop nests in a program. The cache hit-rates are produced by applying the reference string, determined during compilation, to an architecturally parameterized cache simulator. We also describe a heuristic that uses this method for compile-time optimization of loop ranges in iteration-space blocking. The result...

Design and Implementation of a Lightweight Dynamic Optimization System

Many opportunities exist to improve micro-architectural performance due to performance events that are difficult to optimize at static compile time. Cache misses and branch mis-prediction patterns may vary for different micro-architectures using different inputs. Dynamic optimization provides an approach to address these and other performance events at runtime. This paper describes a software s...

Parallel Processing Letters © World Scientific Publishing Company Tiling for Parallel Execution: Optimizing Node Cache Performance

Tiling has been used by parallelizing compilers to define fine-grain parallel tasks and to optimize cache performance. In this paper we present a novel compile-time technique, called miss-driven cache simulation, for determining tile size that achieves the highest cache hit-rate. The widening disparity between the processor's peak instruction rate and the main memory access time in modern processo...

A Study of the Performance Potential for Dynamic Instruction Hints Selection

Instruction hints have become an important way to communicate compile-time information to the hardware. They can be generated by the compiler and the post-link optimizer to reduce cache misses, improve branch prediction and minimize other performance bottlenecks. This paper discusses different instruction hints available on modern processor architectures and shows the potential performance impa...

A Stochastic Approach to Instruction Cache Optimization

The memory alignment of executable code can have significant yet hard to predict effects on the run-time performance of programs, even in the presence of existing aggressive optimization. We present an investigation of two different approaches to instruction cache optimization—the first is an implementation of a well-known and established compile-time heuristic, and the second is our own stocha...

Journal title:
  • Parallel Processing Letters

Volume 6  Issue 

Pages  -

Publication date 1996